Enhancing motif finding models using multiple sources of genome-wide data

Authors

  • Heejung Shim
  • Oliver Bembom
  • Sündüz Keleş
Abstract

The SUCcESS package implements the CTCM model (specifically, its logistic regression formulation) proposed by Shim and Keleş (2008) for integrating quantitative information into motif finding, together with an extension that uses multiple data sources simultaneously (e.g., ChIP-seq, nucleosome occupancy, or conservation scores). Both are implemented as an extension module for cosmo (Bembom et al., 2007), which implements a supervised motif detection algorithm that uses a set of constraints on the position weight matrix. Although this package provides all of the functions implemented in cosmo, we ask that you use the latest version of cosmo itself for the supervised motif detection algorithm; this package is intended for running the SUCcESS option only. This vignette covers only how to run SUCcESS. For methodological details, consult Shim and Keleş (2008); for detailed options, consult the cosmo vignette, which you can find in the document directory of this package.
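To make the idea of integrating several data sources through a logistic regression model concrete, here is a minimal Python sketch. It is not the SUCcESS or cosmo API (both are R packages) and does not reproduce the estimation procedure of Shim and Keleş (2008). It only illustrates, under assumed inputs, how per-sequence covariates from multiple sources could be mapped to a prior probability that a sequence contains a motif. The function name `motif_prior`, the covariate values, the intercept, and the coefficients are all hypothetical.

```python
import math


def motif_prior(covariates, intercept, coefficients):
    """Logistic link: map a sequence's covariates to a probability in (0, 1).

    Hypothetical helper for illustration; not part of SUCcESS or cosmo.
    """
    if len(covariates) != len(coefficients):
        raise ValueError("need one coefficient per covariate")
    # Linear predictor: intercept plus weighted sum of the data sources.
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-linear))


# Made-up covariates per sequence, in the order:
# (ChIP-seq signal, nucleosome occupancy, conservation score).
scores = {
    "seq1": [2.1, 0.3, 0.8],
    "seq2": [-0.5, 0.9, 0.1],
}

# Made-up parameters; in a real analysis these would be estimated from data.
intercept = -1.0
coefficients = [1.5, -2.0, 0.5]

priors = {
    name: motif_prior(x, intercept, coefficients) for name, x in scores.items()
}

for name, p in priors.items():
    print(f"{name}: prior probability of containing a motif = {p:.3f}")
```

With these assumed values, the sequence with the stronger ChIP-seq signal and lower nucleosome occupancy receives the higher prior, which is the qualitative behavior one would expect from combining such sources.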

Similar resources

An integrated approach for genome-wide gene expression analysis

Since efficient and relatively cheap methods were developed for determining biosequences, a lot of biosequence data has been generated. As the main problem in molecular biology is the analysis of the data instead of the data acquisition, part of the study of computational biology is to extract all kinds of meaningful information from the sequences. Computer-assisted methods have become very imp...

Relaxing Haplotype Block Models for Association Testing

The arrival of publicly available genome-wide variation data is creating new opportunities for reconciling model-based methods for associating genotypes and phenotypes with the complexities of real genome data. Such data is particularly valuable for testing the utility of models of conserved haplotype structure to association studies. While there is much interest in "haplotype block" models tha...

Finding the target sites of RNA-binding proteins

RNA-protein interactions differ from DNA-protein interactions because of the central role of RNA secondary structure. Some RNA-binding domains (RBDs) recognize their target sites mainly by their shape and geometry and others are sequence-specific but are sensitive to secondary structure context. A number of small- and large-scale experimental approaches have been developed to measure RNAs assoc...

Bioinformatics Genome-Wide Characterization of the WRKY Gene Family in Sorghum bicolor

The WRKY gene family encodes a large group of transcription factors that regulate genes involved in plant response to biotic and abiotic stresses. Sorghum is a notable grain and forage crop in semi-arid regions because of its unusual tolerance against hot and dry environments. We identified a set of 85 WRKY genes in the S. bicolor genome and classified them into three groups (I–III). Among the ...

TAMO: a flexible, object-oriented framework for analyzing transcriptional regulation using DNA-sequence motifs

SUMMARY TAMO (Tools for Analysis of MOtifs) is an object-oriented computational framework for interpreting transcriptional regulation using DNA-sequence motifs. To simplify the application of multiple motif discovery programs to genome-wide data, TAMO provides a sophisticated motif object with interfaces to several popular programs. In addition, TAMO provides modules for integrating motif analy...



Publication date: 2011